
    A multi-collection latent topic model for federated search

    Collection selection is a crucial function, central to the effectiveness and efficiency of a federated information retrieval system. A variety of solutions have been proposed for collection selection, adapting proven techniques used in centralised retrieval. This paper defines a new approach to collection selection that models the topical distribution in each collection. We describe an extended version of latent Dirichlet allocation that uses a hierarchical hyperprior to enable the different topical distributions found in each collection to be modelled. Under the model, resources are ranked based on the topical relationship between query and collection. By modelling collections in a low-dimensional topic space, we can implicitly smooth their term-based characterisation with appropriate terms from topically related samples, thereby dealing with the problem of missing vocabulary within the samples. An important advantage of adopting this hierarchical model over current approaches is that the model generalises well to unseen documents given small samples of each collection. The latent structure of each collection can therefore be estimated well despite imperfect information about each collection, such as sampled documents obtained through query-based sampling. Experiments demonstrate that this new, fully integrated topical model is more robust than the current state-of-the-art collection selection algorithm.
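
    The ranking step can be pictured with a small sketch. The snippet below is not the paper's hierarchical hyperprior model; it uses a flat scikit-learn LDA over toy collection samples (all collection names and documents are invented) to illustrate the general idea: characterise each collection by a topic distribution and rank collections by their topical similarity to the query.

# Minimal sketch: ranking collections by query-collection topical similarity.
# A flat scikit-learn LDA stands in for the paper's hierarchical model;
# collection names and sampled documents below are illustrative only.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

collections = {
    "news":   ["election results parliament vote", "economy inflation markets"],
    "sports": ["football match score goal", "tennis tournament final set"],
    "health": ["vaccine trial immune response", "diet exercise heart disease"],
}

# Fit one topic model over all sampled documents, then characterise each
# collection by the average topic distribution of its samples.
docs = [d for sample in collections.values() for d in sample]
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs)
lda = LatentDirichletAllocation(n_components=5, random_state=0).fit(X)

def topic_dist(texts):
    return lda.transform(vectorizer.transform(texts)).mean(axis=0)

collection_topics = {name: topic_dist(sample) for name, sample in collections.items()}

def rank_collections(query):
    q = topic_dist([query])
    # Score each collection by cosine similarity between topic vectors.
    scores = {name: float(np.dot(q, t) / (np.linalg.norm(q) * np.linalg.norm(t)))
              for name, t in collection_topics.items()}
    return sorted(scores.items(), key=lambda kv: -kv[1])

print(rank_collections("world cup final score"))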

    Are Word Embedding-based Features Useful for Sarcasm Detection?

    This paper makes a simple increment to the state of the art in sarcasm detection research. Existing approaches are unable to capture subtle forms of context incongruity, which lies at the heart of sarcasm. We explore whether prior work can be enhanced using semantic similarity/discordance between word embeddings. We augment word embedding-based features to four feature sets reported in the past. We also experiment with four types of word embeddings. We observe an improvement in sarcasm detection, irrespective of the word embedding used or the original feature set to which our features are augmented. For example, this augmentation results in an improvement in F-score of around 4% for three out of these four feature sets, and a minor degradation in the case of the fourth, when Word2Vec embeddings are used. Finally, a comparison of the four embeddings shows that Word2Vec and dependency weight-based features outperform LSA and GloVe in terms of their benefit to sarcasm detection.
    Comment: The paper will be presented at the Conference on Empirical Methods in Natural Language Processing (EMNLP) 2016 in November 2016. http://www.emnlp2016.net
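
    As a rough illustration of the kind of features involved, the sketch below computes pairwise cosine similarities between the word vectors of a sentence and appends summary statistics (maximum, minimum, mean, median similarity) to an existing feature vector. The tiny embedding table stands in for pretrained Word2Vec or GloVe vectors and its values are made up; the exact feature definitions in the paper may differ.

# Minimal sketch of similarity/discordance features from word embeddings,
# appended to an existing feature vector. The tiny embedding table is a
# placeholder for pretrained Word2Vec/GloVe vectors; values are illustrative.
import numpy as np
from itertools import combinations

embeddings = {
    "love":    np.array([0.9, 0.1, 0.0]),
    "waiting": np.array([-0.2, 0.8, 0.1]),
    "queue":   np.array([-0.3, 0.7, 0.2]),
    "forever": np.array([-0.1, 0.6, 0.5]),
}

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def embedding_features(tokens):
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    sims = [cosine(u, v) for u, v in combinations(vecs, 2)]
    if not sims:
        return [0.0, 0.0, 0.0, 0.0]
    # High maximum similarity signals agreement; a low minimum hints at incongruity.
    return [max(sims), min(sims), float(np.mean(sims)), float(np.median(sims))]

base_features = [3.0, 1.0, 0.0]          # e.g. counts from an existing feature set
tokens = "i love waiting in this queue forever".split()
augmented = base_features + embedding_features(tokens)
print(augmented)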

    Perception Visualization: Seeing Through the Eyes of a DNN

    Artificial intelligence (AI) systems power the world we live in. Deep neural networks (DNNs) are able to solve tasks in an ever-expanding landscape of scenarios, but our eagerness to apply these powerful models leads us to focus on their performance and deprioritises our ability to understand them. Current research in the field of explainable AI tries to bridge this gap by developing various perturbation- or gradient-based explanation techniques. For images, these techniques fail to fully capture and convey the semantic information needed to elucidate why the model makes the predictions it does. In this work, we develop a new form of explanation that is radically different in nature from current explanation methods, such as Grad-CAM. Perception visualization provides a visual representation of what the DNN perceives in the input image by depicting what visual patterns the latent representation corresponds to. Visualizations are obtained through a reconstruction model that inverts the encoded features, such that the parameters and predictions of the original model are not modified. Results of our user study demonstrate that humans can better understand and predict the system's decisions when perception visualizations are available, thus easing the debugging and deployment of deep models as trusted systems.
    Comment: Accepted paper at BMVC 2021 (proceedings not available yet).
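
    A minimal sketch of the reconstruction idea, assuming a frozen encoder and a small decoder trained with a plain reconstruction loss (both architectures below are invented stand-ins, not the paper's networks): the explained classifier's parameters never change, and the decoder is fit separately to invert its latent features.

# Minimal sketch of a perception-visualization setup: a decoder is trained to
# invert a frozen classifier's latent features, so the original model's
# parameters and predictions are untouched. Architectures are illustrative.
import torch
import torch.nn as nn

class TinyClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2, padding=1),
                                     nn.ReLU(),
                                     nn.Flatten())
        self.head = nn.Linear(16 * 16 * 16, 10)

    def forward(self, x):
        z = self.encoder(x)            # latent representation to be visualized
        return self.head(z), z

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(16 * 16 * 16, 3 * 32 * 32), nn.Sigmoid())

    def forward(self, z):
        return self.net(z).view(-1, 3, 32, 32)

classifier = TinyClassifier().eval()
for p in classifier.parameters():      # freeze: the explained model never changes
    p.requires_grad_(False)

decoder = Decoder()
opt = torch.optim.Adam(decoder.parameters(), lr=1e-3)
images = torch.rand(8, 3, 32, 32)      # stand-in batch of 32x32 RGB images

for _ in range(5):                     # train the decoder to reconstruct the input
    _, z = classifier(images)
    recon = decoder(z)
    loss = nn.functional.mse_loss(recon, images)
    opt.zero_grad(); loss.backward(); opt.step()

# At explanation time, decoder(z) depicts what the latent representation encodes.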

    Social Workers in International Relief and Development: A Natural Fit

    This study sought to examine the compatibility of social work competencies with humanitarian assistance job skill requirements in the market. A systematic analysis of international job descriptions (N=500) was conducted with a focus on the skills required of potential employees. The main themes identified and operationalized into discrete skills and/or behaviors were: technical expertise, intra- and extra-organizational competencies, personal abilities, sector specialization, education, and language requirements. To aid educators in curriculum building, the identified skills were cross-referenced with the Council on Social Work Education’s Educational Policy and Accreditation Standards practice behaviors to determine how they translate into standardized competencies. The study offers important implications for social work education and discusses several avenues for social work employment in international relief and development careers. [Author abstract]

    Alleviating Naive Bayes attribute independence assumption by attribute weighting

    Despite the simplicity of the naive Bayes classifier, it has continued to perform well against more sophisticated newcomers and has remained, therefore, of great interest to the machine learning community. Of numerous approaches to refining the naive Bayes classifier, attribute weighting has received less attention than it warrants. Most approaches, perhaps influenced by attribute weighting in other machine learning algorithms, use weighting to place more emphasis on highly predictive attributes than on those that are less predictive. In this paper, we argue that for naive Bayes attribute weighting should instead be used to alleviate the conditional independence assumption. Based on this premise, we propose a weighted naive Bayes algorithm, called WANBIA, that selects weights to minimize either the negative conditional log-likelihood or the mean squared error objective function. We perform extensive evaluations and find that WANBIA is a competitive alternative to state-of-the-art classifiers like Random Forest, Logistic Regression and A1DE.
    © 2013 Nayyar A. Zaidi, Jesus Cerquides, Mark J. Carman and Geoffrey I. Webb. This research has been supported by the Australian Research Council under grant DP110101427 and the Asian Office of Aerospace Research and Development, Air Force Office of Scientific Research, under contract FA23861214030. The authors would like to thank Mark Hall for providing the code for CFS and MH. The authors would also like to thank the anonymous reviewers for their insightful comments that helped improve the paper tremendously. Peer reviewed.
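
    The weighting idea can be sketched as follows: each attribute's class-conditional probability is raised to a per-attribute weight, and the weights are fit by minimizing the negative conditional log-likelihood. The toy data, Laplace smoothing, and optimizer call below are illustrative assumptions, not the paper's exact WANBIA procedure.

# Minimal sketch of attribute-weighted naive Bayes: each attribute's conditional
# probability is raised to a per-attribute weight, and the weights are chosen by
# minimizing the negative conditional log-likelihood on training data.
import numpy as np
from scipy.optimize import minimize

X = np.array([[0, 1], [1, 1], [1, 0], [0, 0]])   # two binary attributes
y = np.array([0, 0, 1, 1])                       # class labels 0/1
classes = np.unique(y)

prior = np.array([np.mean(y == c) for c in classes])
# cond[c, i, v] = Laplace-smoothed P(x_i = v | y = c)
cond = np.array([[[(np.sum((y == c) & (X[:, i] == v)) + 1) /
                   (np.sum(y == c) + 2) for v in (0, 1)]
                  for i in range(X.shape[1])] for c in classes])

def log_posterior(x, w):
    # log P(c) + sum_i w_i * log P(x_i | c), normalised over classes
    logp = np.log(prior) + np.array(
        [sum(w[i] * np.log(cond[ci, i, x[i]]) for i in range(len(x)))
         for ci in range(len(classes))])
    return logp - np.logaddexp.reduce(logp)

def neg_cond_log_likelihood(w):
    return -sum(log_posterior(X[n], w)[y[n]] for n in range(len(y)))

w0 = np.ones(X.shape[1])                         # w = 1 recovers plain naive Bayes
w_opt = minimize(neg_cond_log_likelihood, w0).x
print("learned weights:", w_opt)
print("posterior for [1, 0]:", np.exp(log_posterior(np.array([1, 0]), w_opt)))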